Fine-tuned models can be vulnerable to adversarial attacks. Existing work on black-box attacks against fine-tuned models (BAFT) is limited by strong assumptions. To fill this gap, we propose two novel BAFT settings, cross-domain and cross-domain cross-architecture BAFT, which only assume that (1) the target model under attack is a fine-tuned model, and (2) the source-domain data is known and accessible. To successfully attack fine-tuned models under both settings, we propose to first train an adversarial generator against the source model, which adopts an encoder-decoder architecture and maps a clean input to an adversarial example. Then we search in the low-dimensional latent space produced by the encoder of the adversarial generator. The search is guided by surrogate gradients obtained from the source model. Experimental results on different domains and different network architectures demonstrate that the proposed attack method can attack fine-tuned models effectively and efficiently.
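The latent-space search described above can be sketched in a few lines. The sketch below is a minimal illustration, assuming a pre-trained encoder-decoder adversarial generator (`encoder`, `decoder`) and white-box access to `source_model`; the perturbation budget, step size, and sign-gradient update are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def latent_space_attack(x, y, encoder, decoder, source_model,
                        steps=50, step_size=0.05, eps=8 / 255):
    """Search the generator's latent space using surrogate gradients from the
    source model; the fine-tuned target model is never queried for gradients."""
    z = encoder(x).detach().requires_grad_(True)        # start from the clean latent code
    for _ in range(steps):
        # decode the current latent code and keep the perturbation within the budget
        x_adv = torch.clamp(x + torch.clamp(decoder(z) - x, -eps, eps), 0, 1)
        loss = F.cross_entropy(source_model(x_adv), y)   # surrogate loss on the source model
        grad, = torch.autograd.grad(loss, z)
        z = (z + step_size * grad.sign()).detach().requires_grad_(True)
    return torch.clamp(x + torch.clamp(decoder(z) - x, -eps, eps), 0, 1)
```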
Recent years have witnessed an astonishing explosion in the evolution of mobile applications powered by AI technologies. The rapid growth of AI frameworks enables the transition of AI technologies to mobile devices, significantly promoting the adoption of AI apps (i.e., apps that integrate AI into their functions) on smartphone devices. In this paper, we conduct the most extensive empirical study on 56,682 published AI apps from three perspectives: dataset characteristics, development issues, and user feedback and privacy. To this end, we build an automated AI app identification tool, AI Discriminator, that detects eligible AI apps from 7,259,232 mobile apps. First, we carry out a dataset analysis, where we explore the AndroZoo large repository to identify AI apps and their core characteristics. Subsequently, we pinpoint key issues in AI app development (e.g., model protection). Finally, we focus on user reviews and user privacy protection. Our paper provides several notable findings. Some essential ones involve revealing the issue of insufficient model protection by presenting the lack of model encryption, and demonstrating the risk of user privacy data being leaked. We publish our large-scale AI app datasets to inspire further research.
This paper reports our progress in building an online language learning tool that provides learners with conversational practice by using dialogue systems as conversation practice partners. Our system can adapt to the user's language proficiency on the fly. We also provide automatic grammatical error feedback to help users learn from their mistakes. According to our early adopters, our system is entertaining and useful. Furthermore, we will release a large-scale dialogue dataset on language learning and grammar correction to the learning-technology community. Our next step is to make our system more adaptive to user profiles by using reinforcement learning algorithms.
Millions of stray animals suffer on the streets or are euthanized in shelters around the world every day. To improve the adoption of stray animals, scoring their Pawpularity (cuteness) is very important, but evaluating the Pawpularity of animals is a highly labor-intensive task. Developing an algorithm to score animals has therefore attracted urgent interest. However, the Kaggle dataset contains not only images but also metadata describing the images. Most recent methods focus on state-of-the-art image regression, but there has been no good way to handle the image metadata. To address this challenge, this paper proposes an image regression model called PETS-SWINF that takes the image metadata into account. Our results, based on the dataset of the Kaggle competition "PetFinder.my", show that PETS-SWINF has an advantage over image-only models: the RMSE loss of the proposed model on the test dataset is 17.71876, compared with 17.76449 without metadata. The advantage of the proposed method is that PETS-SWINF can consider both low-order and high-order features of the metadata and adaptively adjust the weights of the image model and the metadata model. The performance is promising, as our leaderboard score currently ranks 15th out of 3545 teams (gold medal) in the 2021 Kaggle challenge "PetFinder.my".
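The adaptive weighting between the image branch and the metadata branch can be illustrated with a small fusion head. This is a minimal sketch, assuming features from a Swin-style image backbone are already extracted; the layer sizes, the sigmoid gate, and the module names are assumptions for illustration rather than the actual PETS-SWINF architecture.

```python
import torch
import torch.nn as nn

class ImageMetaRegressor(nn.Module):
    def __init__(self, img_feat_dim=768, meta_dim=12, hidden=128):
        super().__init__()
        self.meta_net = nn.Sequential(               # metadata branch; the MLP captures higher-order
            nn.Linear(meta_dim, hidden), nn.ReLU(),  # interactions among the raw (low-order) fields
            nn.Linear(hidden, hidden))
        self.img_proj = nn.Linear(img_feat_dim, hidden)
        self.gate = nn.Sequential(                   # adaptive weight between the two branches
            nn.Linear(2 * hidden, 1), nn.Sigmoid())
        self.head = nn.Linear(hidden, 1)             # Pawpularity score regression

    def forward(self, img_feat, meta):
        h_img = self.img_proj(img_feat)
        h_meta = self.meta_net(meta)
        w = self.gate(torch.cat([h_img, h_meta], dim=-1))
        return self.head(w * h_img + (1 - w) * h_meta).squeeze(-1)
```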
Relative attributes (RAs), which refer to preferences over two images regarding the strength of a specific attribute, enable fine-grained image-to-image translation thanks to their rich semantic information. However, existing work based on RAs fails to reconcile the goal of fine-grained translation with the goal of high-quality generation. We propose a new model, TRIP, to coordinate these two goals and obtain high-quality fine-grained translation. In particular, we simultaneously train two modules: a generator that translates an input image into the desired image with smooth, subtle changes with respect to the attribute of interest; and a ranker that ranks rival preferences consisting of the input image and the desired image. The rival preferences refer to an adversarial ranking process: (1) the ranker regards the desired image and the input image as showing no difference in the desired attribute; (2) the generator fools the ranker into believing that the desired image changes the attribute over the input image as desired. RAs over pairs of real images are introduced to guide the ranker to rank image pairs only with respect to the attribute of interest. With an effective ranker, the generator "wins" the adversarial game by producing high-quality images that exhibit the desired changes compared with the input image. Experiments on two face image datasets and one shoe image dataset demonstrate that our TRIP achieves state-of-the-art results in generating high-fidelity images with smooth changes over the attribute of interest.
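The adversarial ranking between the generator and the ranker can be sketched as two alternating losses. This is a minimal sketch under the assumption that `ranker(a, b)` outputs a real-valued score of the attribute change from image a to image b and `generator(x, v)` edits x by a requested amount v; the specific losses and the label convention are illustrative, not the exact TRIP objective.

```python
import torch
import torch.nn.functional as F

def ranker_step(ranker, generator, x, v, x_a, x_b, y):
    """y in {-1, +1}: which real image of the RA pair is stronger on the attribute."""
    x_fake = generator(x, v).detach()
    loss_fake = ranker(x, x_fake).pow(2).mean()           # (1) treat the generated change as zero
    loss_real = F.softplus(-y * ranker(x_a, x_b)).mean()  # rank real RA pairs correctly
    return loss_fake + loss_real

def generator_step(ranker, generator, x, v):
    x_fake = generator(x, v)
    return F.mse_loss(ranker(x, x_fake), v)               # (2) fool the ranker: the change looks like v
```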
We propose a conservative energy method based on neural networks with subdomains (CENN), in which the admissible function satisfying the essential boundary conditions without a boundary penalty is constructed from radial basis functions (RBF), a particular-solution neural network, and a general neural network. Compared with the strong-form PINN with subdomains, the loss term at the interfaces has a lower order. The advantage of the proposed method is that it is more efficient and more accurate and has fewer hyperparameters than the strong-form PINN with subdomains. Another advantage is that it can be applied to complex geometries thanks to the special construction of the admissible function. To analyze its performance, the proposed method is applied to simulate representative PDEs, including problems with strong discontinuities, singularities, complex boundaries, nonlinearities, and heterogeneity. Moreover, it outperforms other methods when dealing with heterogeneous problems.
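The key construction is an admissible function that satisfies the essential boundary conditions exactly, so no boundary penalty is needed, and whose network part is trained by minimizing the potential energy. Below is a minimal 1D sketch for the energy form of -u'' = f on (0, 1) with u(0) = u(1) = 0; the hand-picked particular solution and distance function stand in for the paper's RBF and particular-solution networks, and the sampling and optimizer settings are assumptions.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
f = lambda x: torch.ones_like(x)                   # source term

def u(x):
    # u_p = 0 already matches the boundary values, and D(x) = x(1 - x) vanishes on the
    # boundary, so u satisfies the essential conditions exactly (no boundary penalty).
    return x * (1 - x) * net(x)

for _ in range(2000):
    x = torch.rand(256, 1, requires_grad=True)     # Monte-Carlo points in the domain
    ux = u(x)
    du = torch.autograd.grad(ux, x, torch.ones_like(ux), create_graph=True)[0]
    energy = (0.5 * du.pow(2) - f(x) * ux).mean()  # estimate of the potential energy
    opt.zero_grad(); energy.backward(); opt.step()
```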
This paper proposes the Differential-Critic Generative Adversarial Network (DiCGAN) to learn the distribution of user-desired data when only part of, rather than the entire, dataset possesses the desired property. DiCGAN generates desired data that meets the user's expectations and can assist in designing biological products with desired properties. Existing approaches first select the desired samples and then train regular GANs on the selected samples to derive the user-desired data distribution. However, the selection of the desired data relies on global knowledge of and supervision over the entire dataset. DiCGAN introduces a differential critic that learns from pairwise preferences, which are local knowledge and can be defined on part of the training data. The critic is built by defining an additional ranking loss over the Wasserstein GAN critic. It endows the difference in critic values between each pair of samples with the user preference and guides the generation of the desired data instead of the whole data. To obtain a more efficient solution that ensures data quality, we further reformulate DiCGAN as a constrained optimization problem, based on which we theoretically prove the convergence of our DiCGAN. Extensive experiments on diverse datasets with various applications demonstrate that our DiCGAN achieves state-of-the-art performance in learning the user-desired data distribution, especially with insufficient desired data and limited supervision.
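The differential critic amounts to a standard WGAN critic plus a pairwise ranking term on user preferences. The sketch below is a minimal illustration, assuming `x_pos` is preferred over `x_neg` in each pair; the hinge margin and the weighting factor are assumptions, not DiCGAN's exact formulation.

```python
import torch
import torch.nn.functional as F

def critic_loss(D, x_real, x_fake, x_pos, x_neg, lam=1.0, margin=1.0):
    wgan = D(x_fake).mean() - D(x_real).mean()              # usual Wasserstein critic term
    rank = F.relu(margin - (D(x_pos) - D(x_neg))).mean()    # critic-value gap encodes the preference
    return wgan + lam * rank

def generator_loss(D, x_fake):
    return -D(x_fake).mean()   # the generator chases high critic values, i.e. the user-desired region
```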
Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications cannot or only marginally benefit from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distillation targets, losses, input, network regularization, sequential distillation, etc., revealing that: 1) Distilling token relations is more effective than CLS token- and feature-based distillation; 2) Using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student mismatches that of the teacher; 3) Weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over from-scratch MIM pre-training on ImageNet-1K classification, using the ViT-Tiny, ViT-Small, and ViT-Base models, with +4.2%/+2.4%/+1.4% gains, respectively. Our TinyMIM model of base size achieves 52.2 mIoU on ADE20K semantic segmentation, which is +4.1 mIoU higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way of developing small vision Transformer models, namely by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
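Finding 1) above, distilling token relations, can be sketched as matching row-normalized token-token similarity maps between the teacher's chosen (intermediate) layer and the student. This is a minimal sketch assuming both models expose per-token features of shape (B, N, C); the cosine-similarity relation and the temperature are illustrative choices rather than TinyMIM's exact relation targets.

```python
import torch
import torch.nn.functional as F

def token_relation(tokens, tau=0.1):
    t = F.normalize(tokens, dim=-1)
    return F.log_softmax(t @ t.transpose(-2, -1) / tau, dim=-1)   # (B, N, N) log-relations

def relation_distill_loss(student_tokens, teacher_tokens, tau=0.1):
    s = token_relation(student_tokens, tau)
    with torch.no_grad():
        t = token_relation(teacher_tokens, tau).exp()             # teacher relation as soft target
    return F.kl_div(s, t, reduction="batchmean")
```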
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point-cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT exhibits strong robustness even if the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
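A minimal sketch of the token-level fusion idea follows, assuming image tokens and point-cloud tokens with positional information already encoded; the DETR-style object queries, decoder depth, and head layout are assumptions for illustration, not the exact CMT design.

```python
import torch
import torch.nn as nn

class TokenFusionDetector(nn.Module):
    def __init__(self, dim=256, num_queries=100, num_classes=10):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=6)
        self.box_head = nn.Linear(dim, 7)        # (x, y, z, w, l, h, yaw)
        self.cls_head = nn.Linear(dim, num_classes)

    def forward(self, img_tokens, pts_tokens):
        memory = torch.cat([img_tokens, pts_tokens], dim=1)   # fuse tokens, no explicit view transform
        q = self.queries.weight.unsqueeze(0).expand(memory.size(0), -1, -1)
        h = self.decoder(q, memory)
        return self.box_head(h), self.cls_head(h)
```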
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
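The NAIVEATTACK variant, adding a trigger to the raw data before distillation starts, can be sketched in a few lines. This is a minimal illustration under assumed choices of patch size, poison rate, and target label; DOORPING's iterative trigger updates during distillation are not shown.

```python
import torch

def naive_poison(images, labels, target_label=0, poison_rate=0.05, patch=4):
    """images: (N, C, H, W) in [0, 1]; stamps a white square trigger on a random subset."""
    n = images.size(0)
    idx = torch.randperm(n)[: int(poison_rate * n)]
    images, labels = images.clone(), labels.clone()
    images[idx, :, -patch:, -patch:] = 1.0   # trigger patch in the bottom-right corner
    labels[idx] = target_label               # point the trigger at the attacker's class
    return images, labels                    # the distillation procedure then runs on this poisoned set
```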